Encoding Prior Knowledge with Eigenword Embeddings
Authors
Abstract
Canonical correlation analysis (CCA) is a method for reducing the dimension of data represented using two views. It has been previously used to derive word embeddings, where one view indicates a word, and the other view indicates its context. We describe a way to incorporate prior knowledge into CCA, give a theoretical justification for it, and test it by deriving word embeddings and evaluating them on a myriad of datasets.
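The abstract's construction (CCA between a one-hot word view and a context view) reduces, up to scaling, to an SVD of a correlation-normalized co-occurrence matrix. Below is a minimal sketch of that eigenword-style computation on a toy corpus; the corpus, window size, and dimension `k` are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np

# Toy corpus standing in for a large text corpus (illustrative only).
corpus = "the cat sat on the mat the dog sat on the rug".split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}
V = len(vocab)

# Word-context co-occurrence counts with a window of 1 on each side.
C = np.zeros((V, V))
for i, w in enumerate(corpus):
    for j in (i - 1, i + 1):
        if 0 <= j < len(corpus):
            C[idx[w], idx[corpus[j]]] += 1

# CCA between the one-hot word view and the context view reduces
# (up to scaling) to an SVD of the count matrix normalized by the
# square roots of the word and context marginal counts.
row = C.sum(axis=1, keepdims=True)
col = C.sum(axis=0, keepdims=True)
S = C / np.sqrt(row) / np.sqrt(col)

U, s, Vt = np.linalg.svd(S, full_matrices=False)
k = 3                      # embedding dimension (assumption)
embeddings = U[:, :k]      # one k-dimensional vector per vocabulary word
```

The left singular vectors give one embedding per word; incorporating prior knowledge, as the paper proposes, amounts to modifying this objective rather than post-processing the vectors.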
Similar resources
Integrating Semantic Knowledge into Lexical Embeddings Based on Information Content Measurement
Distributional word representations are widely used in NLP tasks. These representations are based on the assumption that words with similar contexts tend to have similar meanings. To improve the quality of context-based embeddings, many studies have explored how to make full use of existing lexical resources. In this paper, we argue that while we incorporate the prior knowledge with con...
Improving Lexical Embeddings with Semantic Knowledge
Word embeddings learned on unlabeled data are a popular tool in semantics, but may not capture the desired semantics. We propose a new learning objective that incorporates both a neural language model objective (Mikolov et al., 2013) and prior knowledge from semantic resources to learn improved lexical semantic embeddings. We demonstrate that our embeddings improve over those learned solely on ...
Constraining Word Embeddings by Prior Knowledge - Application to Medical Information Retrieval
Word embedding has been used in many NLP tasks and has shown some capability to capture semantic features. It has also been used in several recent studies in IR. However, word embeddings trained in an unsupervised manner may fail to capture some of the semantic relations in a specific area (e.g. healthcare). In this paper, we leverage the existing knowledge (word relations) in the medical domain to c...
Convolutional Neural Network Based Semantic Tagging with Entity Embeddings
Unsupervised word embeddings provide rich linguistic and conceptual information about words. However, they may provide weak information about domain specific semantic relations for certain tasks such as semantic parsing of natural language queries, where such information about words or phrases can be valuable. To encode the prior knowledge about the semantic word relations, we extended the neur...
Enriching Word Embeddings Using Knowledge Graph for Semantic Tagging in Conversational Dialog Systems
Unsupervised word embeddings provide rich linguistic and conceptual information about words. However, they may provide weak information about domain specific semantic relations for certain tasks such as semantic parsing of natural language queries, where such information about words can be valuable. To encode the prior knowledge about the semantic word relations, we present a new method as follow...
Journal: TACL
Volume: 4, Issue: -
Pages: -
Publication year: 2016